NSF PAR Search | NSF Public Access Repository

The Essentials of AI for Life and Society: An AI Literacy Course for the University Community

https://doi.org/10.1609/aaai.v39i28.35166

Biswas, Joydeep; Fussell, Don; Stone, Peter; Patterson, Kristin; Procko, Kristen; Sabatini, Lea; Xu, Zifan (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

We describe the development of a one-credit course to promote AI literacy at the University of Texas at Austin. In response to a call for the rapid deployment of a class that would serve a broad audience in Fall of 2023, we designed a 14-week seminar-style course that incorporated an interdisciplinary group of speakers who lectured on topics ranging from the fundamentals of AI to societal concerns including disinformation and employment. University students, faculty, and staff, and even community members outside of the University were invited to enroll in this online offering: The Essentials of AI for Life and Society. We collected feedback from course participants through weekly reflections and a final survey. Satisfyingly, we found that attendees reported gains in their AI literacy. We sought critical feedback through quantitative and qualitative analysis, which uncovered challenges in designing a course for this general audience. We utilized the course feedback to design a three-credit version of the course that is being offered in Fall of 2024. The lessons we learned and our plans for this new iteration may serve as a guide to instructors designing AI courses for a broad audience.
Free, publicly-accessible full text available April 11, 2026
A Fast Work-Efficient SSSP Algorithm for GPUs

https://doi.org/10.1145/3437801.3441605

Wang, Kai; Fussell, Don; Lin, Calvin (February 2021, Annual Symposium on Principles and Practice of Parallel Programming)

This paper presents a new Single-Source Shortest Path (SSSP) algorithm for GPUs. Our key advancement is an improved work scheduler, which is central to the performance of SSSP algorithms. Previous GPU solutions for SSSP use simple work schedulers that can be implemented efficiently on GPUs but that produce low-quality schedules. Such solutions yield poor work efficiency and can underutilize the hardware due to a lack of parallelism. Our solution introduces a more sophisticated work scheduler, based on a novel highly parallel approximate priority queue, that produces high-quality schedules while being efficiently implementable on GPUs. To evaluate our solution, we use 226 graph inputs from the Lonestar 4.0 benchmark suite and the SuiteSparse Matrix Collection, and we find that our solution outperforms the previous state-of-the-art solution by an average of 2.9×, showing that an efficient work scheduling mechanism can be implemented on GPUs without sacrificing schedule quality. While this paper focuses on the SSSP problem, it has broader implications for the use of GPUs, illustrating that seemingly ill-suited data structures, such as priority queues, can be efficiently implemented for GPUs if we use the proper software structure.
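To make the idea of an approximate priority queue concrete, the sketch below orders SSSP work by coarse distance buckets rather than by exact distance. It is a sequential, delta-stepping-style illustration in Python and is not the scheduler described in the abstract; the function name `sssp_bucketed`, the `delta` bucket width, and the adjacency-list format are assumptions made for this example.

```python
import math
from collections import defaultdict


def sssp_bucketed(graph, source, delta):
    """Shortest distances from `source` using coarse distance buckets.

    graph: dict mapping node -> list of (neighbor, weight), weights >= 0.
    delta: bucket width (> 0); larger values mean a looser ordering.
    This is an illustrative sketch, not the paper's GPU scheduler.
    """
    dist = {source: 0}
    # Approximate priority queue: bucket index -> nodes whose tentative
    # distance fell into that bucket when they were inserted.
    buckets = defaultdict(list)
    buckets[0].append(source)

    while buckets:
        b = min(buckets)           # lowest non-empty bucket
        frontier = buckets.pop(b)  # its nodes could be relaxed in parallel
        for u in frontier:
            # Skip stale entries: u has since moved to a lower bucket.
            if int(dist[u] // delta) != b:
                continue
            for v, w in graph.get(u, ()):
                nd = dist[u] + w
                if nd < dist.get(v, math.inf):
                    dist[v] = nd
                    buckets[int(nd // delta)].append(v)
    return dist
```

Nodes inside one bucket are processed in arbitrary order, so some relaxations may be repeated; that extra work is the price of the looser ordering, and choosing `delta` trades available parallelism against work efficiency.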
Full Text Available
Fast Fine-Grained Global Synchronization on GPUs

https://doi.org/10.1145/3297858.3304055

Wang, Kai; Fussell, Don; Lin, Calvin (April 2019, Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems)

This paper extends the reach of General Purpose GPU programming by presenting a software architecture that supports efficient fine-grained synchronization over global memory. The key idea is to transform global synchronization into global communication so that conflicts are serialized at the thread block level. With this structure, the threads within each thread block can synchronize using low-latency, high-bandwidth local scratchpad memory. To enable this architecture, we implement a scalable and efficient message passing library. Using NVIDIA GTX 1080 Ti GPUs, we evaluate our new software architecture by using it to solve a set of five irregular problems on a variety of workloads. We find that on average, our solutions improve performance over carefully tuned state-of-the-art solutions by 3.6×.
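The key idea of turning synchronization into communication can be sketched as follows: instead of many workers locking shared data, each update is sent as a message to the single owner of the data it touches, and each owner applies its messages serially. The Python below is a sequential simulation under stated assumptions; the function name `apply_updates_by_owner`, the modulo ownership rule, and the list-based mailboxes are hypothetical and do not model thread blocks, scratchpad memory, or the message passing library from the abstract.

```python
def apply_updates_by_owner(num_slots, num_owners, updates):
    """Apply (slot, increment) updates without per-slot locking.

    Ownership rule (an assumption for this sketch): slot s belongs to
    owner s % num_owners, so two owners never touch the same slot.
    """
    # Phase 1, communication: route every update to its owner's mailbox.
    mailboxes = [[] for _ in range(num_owners)]
    for slot, inc in updates:
        mailboxes[slot % num_owners].append((slot, inc))

    # Phase 2, local processing: each owner drains its own mailbox.
    # Conflicting updates to one slot are serialized inside one owner,
    # so owners could run concurrently without global locks.
    data = [0] * num_slots
    for box in mailboxes:
        for slot, inc in box:
            data[slot] += inc
    return data
```

For example, `apply_updates_by_owner(4, 2, [(0, 1), (3, 2), (0, 5), (2, 1)])` sends both updates to slot 0 to the same owner, which applies them one after the other.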
Full Text Available